Statistical Modelling under Epistemic Data Imprecision: Some Results on Estimating Multinomial Distributions and Logistic Regression for Coarse Categorical Data
Abstract
The paper deals with parameter estimation for categorical data under epistemic data imprecision, where for a part of the data only coarse(ned) versions of the true values are observable. For different observation models formalizing the information available on the coarsening process, we derive the (typically set-valued) maximum likelihood estimators of the underlying distributions. We discuss the homogeneous case of independent and identically distributed variables as well as logistic regression under a categorical covariate. We start with the imprecise point estimator under an observation model describing the coarsening process without any further assumptions. Then we determine several sensitivity parameters that allow the refinement of the estimators in the presence of auxiliary information.
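As a rough illustration of what a set-valued estimator can look like, consider the simplest homogeneous case: a binary variable with categories A and B, where some observations are only recorded as the coarse value "A or B". The sketch below is an assumption-laden toy, not the paper's derivation. It assumes nothing about the coarsening process, so every coarse observation may truly be A or truly be B, and the estimate of P(A) becomes an interval. The function name and arguments are hypothetical.

```python
def coarse_binary_mle_interval(n_a, n_b, n_ab):
    """Interval of estimates for P(A) when n_ab observations are only
    known to be 'A or B' (illustrative sketch, no coarsening assumptions)."""
    n = n_a + n_b + n_ab
    if n == 0:
        raise ValueError("at least one observation is required")
    # Lower bound: every coarse observation is truly B.
    lower = n_a / n
    # Upper bound: every coarse observation is truly A.
    upper = (n_a + n_ab) / n
    return lower, upper


# Example: 30 precise A, 50 precise B, 20 coarse 'A or B' observations.
lower, upper = coarse_binary_mle_interval(30, 50, 20)
print(lower, upper)
```

Auxiliary information about the coarsening process, of the kind the abstract calls sensitivity parameters, would narrow this interval; with no coarse observations it collapses to the usual relative frequency.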
Similar resources
Statistical Modelling in Surveys without Neglecting The Undecided: Multinomial Logistic Regression Models and Imprecise Classification Trees under Ontic Data Imprecision
In surveys, and most notably in election polls, undecided participants frequently constitute subgroups of their own with specific individual characteristics. While traditional survey methods and corresponding statistical models are inherently damned to neglect this valuable information, an ontic random set view provides us with the full power of the whole statistical modelling framework. We ela...
Multinomial logit - Wikipedia, the free encyclopedia
In statistics, a multinomial logit (MNL) model, also known as multinomial logistic regression, is a regression model which generalizes logistic regression by allowing more than two discrete outcomes.[1] That is, it is a model that is used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which...
Multiple Logistic Regressio
A common problem in software cost estimation is the manipulation of incomplete or missing data in databases used for the development of prediction models. In such cases, the most popular and simple method of handling missing data is to ignore either the projects or the attributes with missing observations. This technique causes the loss of valuable information and therefore may lead to inaccura...
Bayesian Inference for Poisson and Multinomial Log-linear Models
Categorical data frequently arise in applications in the social sciences. In such applications, the class of log-linear models, based on either a Poisson or (product) multinomial response distribution, is a flexible model class for inference and prediction. In this paper we consider the Bayesian analysis of both Poisson and multinomial log-linear models. It is often convenient to model multinomi...
Comparing Discriminant Analysis, Ecological Niche Factor Analysis and Logistic Regression Methods for Geographic Distribution Modelling of Eurotia ceratoides (L.) C. A. Mey
Eurotia ceratoides (L.) C. A. Mey is an important plant species in semi-arid lands in Iran. New approaches are required to determine the distribution of this plant species. For this reason, geographical distributions of Eurotia ceratoides were assessed using three different models: Multiple Discriminant Analysis (MDA), Ecological Niche Factor Analysis (ENFA) and Logistic Regression (LR). ...